Learning to Handle Inconsistency for Multi-Source Integration

Authors

  • Sheila Tejada
  • Craig A. Knoblock
  • Steven Minton
Abstract

Many problems arise when integrating information from multiple sources on the web. One of these problems is that data instances can exist in inconsistent formats across several sources. An example application of information integration is combining the reviews of Los Angeles restaurants from Yahoo’s Restaurants webpage with the current health rating for each restaurant from the LA County Department of Health’s website. Integrating these sources requires determining whether they share any of the same restaurants by comparing the data instances from both sources (Figure 1). Because the instances can be in different formats, e.g. the restaurant “Jerry’s Famous Deli” from Yahoo’s webpage can appear as “Jerry’s Famous Delicatessen” in the Dept. of Health’s source, they cannot be compared using equality but must be judged according to similarity.
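To make the contrast between equality and similarity concrete, here is a minimal sketch using the restaurant names from the abstract. It is not the authors' method: the character-based ratio from Python's standard `difflib` module is swapped in as a generic stand-in for a similarity measure, and the helper names `similarity` and `is_probable_match` as well as the 0.8 threshold are assumptions chosen only for illustration.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a score in [0, 1]; 1.0 means the normalized strings are identical."""
    # Normalize case and surrounding whitespace before comparing.
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def is_probable_match(a: str, b: str, threshold: float = 0.8) -> bool:
    """Treat two names as the same entity if they are similar enough.

    The threshold is a hypothetical value for this sketch, not one taken
    from the paper.
    """
    return similarity(a, b) >= threshold


yahoo_name = "Jerry's Famous Deli"           # as listed on Yahoo's webpage
health_name = "Jerry's Famous Delicatessen"  # as listed by the Dept. of Health

exact_match = yahoo_name == health_name                    # equality test
approx_match = is_probable_match(yahoo_name, health_name)  # similarity test

print("equal:", exact_match)
print("similar enough:", approx_match)
```

The equality test rejects the pair, while the similarity test accepts it. A fixed, hand-picked threshold is the weak point of such a sketch, since a suitable cutoff depends on the sources being integrated.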


Similar resources

An Online Q-learning Based Multi-Agent LFC for a Multi-Area Multi-Source Power System Including Distributed Energy Resources

This paper presents an online two-stage Q-learning based multi-agent (MA) controller for load frequency control (LFC) in an interconnected multi-area multi-source power system integrated with distributed energy resources (DERs). The proposed control strategy consists of two stages. The first stage employs a PID controller whose parameters are designed using sine cosine optimization (SCO...


Geo-Web Service Tool for Spatial Data Integrability

The integration of multi-source heterogeneous spatial data is one of the major challenges for many spatial data users. Users put much effort into identifying and overcoming inconsistency among data sets through a time-consuming and costly process. Spatial applications that rely on multi-source heterogeneous data also suffer from the lack of an automatic mechanism to identify the inconsistent items and as...


MMDT: Multi-Objective Memetic Rule Learning from Decision Tree

In this article, a Multi-Objective Memetic Algorithm (MA) for rule learning is proposed. Prediction accuracy and interpretation are two measures that conflict with each other. In this approach, we consider both the accuracy and the interpretation of rule sets. Additionally, individual classifiers face other problems such as huge size, high dimensionality and data sets with imbalanced class distributions. This...


A Solution for Data Inconsistency in Data Integration

Data integration is the problem of combining data residing at different sources and providing the user with a unified view of these data. An important issue in data integration is the possibility of conflicts among the different data sources. Data sources may conflict with each other at the data value level, which is defined as data inconsistency. So in this paper, a solution for data inconsistency in...


Statement of interest: inconsistency-tolerance in data integration systems

The task of a data integration system is to combine the data residing at different, autonomous sources and to provide the user with a unified view of these data, called the global schema. Users query the global schema, while the system carries out the task of suitably accessing the different sources and assembling the data retrieved at each source into the final answer to the query. Since sources are i...



Publication date: 1999